meta state
Curious Causality-Seeking Agents Learn Meta Causal World
Zhao, Zhiyu, Li, Haoxuan, Zhang, Haifeng, Wang, Jun, Faccio, Francesco, Schmidhuber, Jürgen, Yang, Mengyue
When building a world model, a common assumption is that the environment follows a single, unchanging causal rule, like applying Newton's laws to every situation. In reality, what appears to be a drifting causal mechanism is often a fixed underlying mechanism seen through a narrow observational window. This creates a problem: when building a world model, even subtle shifts in policy or environment state can alter the observed causal mechanisms. In this work, we introduce the \textbf{Meta-Causal Graph} as a world model, a minimal unified representation that efficiently encodes the transformation rules governing how causal structures shift across latent world states. A single Meta-Causal Graph comprises multiple causal subgraphs, each triggered by a meta state in the latent state space. Building on this representation, we introduce a \textbf{Causality-Seeking Agent} whose objectives are to (1) identify the meta states that trigger each subgraph, (2) discover the corresponding causal relationships through a curiosity-driven intervention policy, and (3) iteratively refine the Meta-Causal Graph through ongoing curiosity-driven exploration and accumulated agent experience. Experiments on both synthetic tasks and a challenging robot-arm manipulation task demonstrate that our method robustly captures shifts in causal dynamics and generalizes effectively to previously unseen contexts.
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.77)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
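The core data structure in the abstract above can be sketched concretely. Below is a minimal illustration (our own, not the paper's code) of a Meta-Causal Graph as a mapping from meta states to causal subgraphs over a shared variable set, with a hypothetical trigger rule that reads the active meta state off the latent state:

```python
import numpy as np

# Variables shared by all causal subgraphs.
VARS = ["x0", "x1", "x2"]

# A Meta-Causal Graph: each meta state indexes a binary adjacency
# matrix (entry [i, j] = 1 means an edge x_i -> x_j).
meta_causal_graph = {
    # meta state 0: the chain x0 -> x1 -> x2
    0: np.array([[0, 1, 0],
                 [0, 0, 1],
                 [0, 0, 0]]),
    # meta state 1: only x0 -> x2 (the x0 -> x1 edge vanishes)
    1: np.array([[0, 0, 1],
                 [0, 0, 0],
                 [0, 0, 0]]),
}

def active_subgraph(latent_state, threshold=0.5):
    """Trigger rule (hypothetical): the meta state is read off the
    first coordinate of the latent state."""
    meta = 0 if latent_state[0] < threshold else 1
    return meta, meta_causal_graph[meta]

meta, graph = active_subgraph(np.array([0.9, 0.2]))
print(meta, graph[0, 2])  # meta state 1 is active; the x0 -> x2 edge is present
```

In the paper both the trigger regions and the subgraphs are learned through intervention, whereas here they are hard-coded purely to show the shape of the representation.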
Hyper-Meta Reinforcement Learning with Sparse Reward
Hua, Yun, Wang, Xiangfeng, Jin, Bo, Li, Wenhao, Yan, Junchi, He, Xiaofeng, Zha, Hongyuan
Despite their success, existing meta reinforcement learning methods still struggle to learn an effective meta policy for RL problems with sparse rewards. To this end, we develop a novel meta reinforcement learning framework, Hyper-Meta RL (HMRL), for sparse-reward RL problems. It consists of three modules: meta state embedding, meta reward shaping, and meta policy learning. The cross-environment meta state embedding module constructs a common meta state space to adapt to different environments; the meta-state-based, environment-specific meta reward shaping effectively extends the original sparse-reward trajectory through cross-environment knowledge complementarity; as a consequence, the meta policy achieves better generalization and efficiency with the shaped meta reward. Experiments on sparse-reward tasks show the superiority of HMRL in both transferability and policy-learning efficiency.
- Asia > China > Shanghai > Shanghai (0.05)
- North America > United States (0.04)
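The reward-shaping idea in the abstract above can be illustrated with a standard potential-based shaping sketch. HMRL learns its shaping from a cross-environment meta state embedding; the fixed random embedding and hand-picked potential below are stand-ins for those learned modules, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)
W_embed = rng.random((4, 2))  # toy meta state embedding matrix (assumed)

def potential(state):
    """Potential = norm of the embedded meta state (an assumed,
    hand-crafted choice; HMRL learns this signal)."""
    return float(np.linalg.norm(W_embed.T @ state))

def shaped_reward(sparse_r, s, s_next, gamma=0.99):
    # Potential-based shaping densifies a sparse reward while leaving
    # the optimal policy unchanged (Ng, Harada & Russell, 1999).
    return sparse_r + gamma * potential(s_next) - potential(s)

s, s_next = np.zeros(4), np.ones(4)
r = shaped_reward(0.0, s, s_next)  # sparse reward is 0, yet shaping adds signal
print(r > 0)
```

The point of the sketch is only that a trajectory with zero extrinsic reward still receives a dense learning signal through the embedding-derived potential.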
Identifying Low-Dimensional Structures in Markov Chains: A Nonnegative Matrix Factorization Approach
Ghasemi, Mahsa, Hashemi, Abolfazl, Vikalo, Haris, Topcu, Ufuk
A variety of queries about stochastic systems boil down to the study of Markov chains and their properties. If the Markov chain is large, as is typically the case for discretized continuous spaces, such analysis may be computationally intractable. Nevertheless, in many scenarios, Markov chains have underlying structural properties that allow them to admit a low-dimensional representation. For instance, the transition matrix associated with the model may be low-rank and hence representable in a lower-dimensional space. We consider the problem of learning low-dimensional representations for large-scale Markov chains. To that end, we formulate representation learning as the task of mapping the state space of the model to a low-dimensional state space, referred to as the kernel space. The kernel space contains a set of meta states, each intended to represent only a small subset of the original states. To promote this structural property, we constrain the number of nonzero entries in the mappings between the state space and the kernel space. By imposing the desired characteristics of the structured representation, we cast the problem as nonnegative matrix factorization. To compute the solution, we propose an efficient block coordinate gradient descent algorithm and theoretically analyze its convergence properties. Our extensive simulation results demonstrate the efficacy of the proposed algorithm in terms of both the quality of the low-dimensional representation and its computational cost.
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
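The low-rank factorization idea in the abstract above can be sketched with generic NMF. This uses plain Lee–Seung multiplicative updates on a synthetic rank-2 transition matrix; the paper's actual algorithm is a block coordinate gradient descent with sparsity constraints on the mappings, which this stand-in does not reproduce:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2

# Construct a row-stochastic transition matrix that is exactly rank 2:
# states 0-2 share one meta state, states 3-5 share another.
W0 = np.zeros((n, k))
W0[:3, 0], W0[3:, 1] = 1.0, 1.0
H0 = rng.random((k, n)) + 0.1
P = W0 @ H0
P /= P.sum(axis=1, keepdims=True)  # normalize rows to sum to 1

# Factorize P ~ W @ H with nonnegative factors via multiplicative updates.
W = rng.random((n, k)) + 0.1
H = rng.random((k, n)) + 0.1
err0 = np.linalg.norm(P - W @ H)
for _ in range(2000):
    H *= (W.T @ P) / (W.T @ W @ H + 1e-12)
    W *= (P @ H.T) / (W @ H @ H.T + 1e-12)
err = np.linalg.norm(P - W @ H)
print(err < 0.05 * err0)  # the rank-2 structure is recovered
```

Here each column of `W` plays the role of a meta state in the kernel space; the sparsity the paper enforces explicitly (few nonzeros per mapping) emerges in this toy example only because the data is block-structured by construction.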